Notebook 4: Machine learning analysis

1 Overview

In this notebook we analyse the results of running the bandle classifier for differential localisation. We follow the vignettes of the bandle package in Bioconductor.

1.1 Install and load packages

Install the R packages needed for data processing:

if (!require("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}

BiocManager::install(c("pRoloc",
                       "MSnbase",
                       "pRolocGUI",
                       "bandle",
                       "tidyverse",
                       "patchwork",
                       "gridExtra",
                       "colorspace",
                       "gplots",
                       "kableExtra",
                       "here",
                       "VennDiagram",
                       "eulerr",
                       "Rtsne",
                       "RColorBrewer"))

Load packages

library("pRoloc")
library("MSnbase")
library("pRolocGUI")
library("tidyverse")
library("patchwork")
library("gridExtra")
library("colorspace")
library("gplots")
library("kableExtra")
library("Rtsne")
library("here")
library("VennDiagram")
library("eulerr")
library("RColorBrewer")
library("bandle")

1.2 Load functions

source(here("R/prettymap.R"))
source(here("R/bandle_outliers.R"))
## 
## Attaching package: 'gtools'
## The following object is masked from 'package:futile.logger':
## 
##     scat

2 Processing BANDLE output

2.1 Load BANDLE output

The bandle output can be found on Zenodo at DOI xxx as it is too big for this repo.

load(here("output/adipocyte_lopit_bandle_input.rda"))
load(here("output/adipocyte_lopit_bandle_output.rda"))
bandle_output <- bandleProcess(bandle_output)

2.2 Assessing convergence

The bandle package uses Markov chain Monte Carlo (MCMC) sampling. In MCMC methods, "chains" of random samples are drawn from the unknown posterior, where the current value depends on the previously drawn value (but not on values before that). Once a chain has converged, its elements can be treated as a sample from the target posterior probability distribution. A common first check of whether chains have converged is to inspect trace and density plots. We use the plotOutliers function (which uses code from the [coda package](https://cran.r-project.org/web/packages/coda/index.html)) to generate trace and density plots for each replicate and each condition to give us an idea of convergence.

plotOutliers(bandle_output)
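To make the idea of a chain concrete, here is a minimal sketch of a random-walk Metropolis sampler targeting a standard normal distribution. It is an illustration only and is not the sampler used inside bandle; the target distribution, the proposal width and the burn-in length are all assumptions chosen for the example.

```r
set.seed(42)

n_iter  <- 5000
burn_in <- 500             # assumed burn-in, discarded before summarising
chain   <- numeric(n_iter) # the chain starts at 0

for (i in 2:n_iter) {
  current  <- chain[i - 1]
  # propose a new value that depends only on the current value
  proposal <- current + rnorm(1)
  # log acceptance ratio for a standard normal target (illustrative choice)
  log_ratio <- dnorm(proposal, log = TRUE) - dnorm(current, log = TRUE)
  chain[i] <- if (log(runif(1)) < log_ratio) proposal else current
}

# after burn-in, the draws approximate a sample from the target
kept <- chain[-seq_len(burn_in)]
c(mean = mean(kept), sd = sd(kept))
```

A trace plot of this toy chain is `plot(kept, type = "l")` and a density plot is `plot(density(kept))`; a converged chain shows a stable band with no drift in the trace.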

We further examine the Gelman statistics to give us more information on convergence.

calculateGelman(bandle_output)
## [[1]]
##             comb_12  comb_13  comb_14   comb_23   comb_24   comb_34
## Point_Est 0.9994007 0.999833 1.000146 0.9995404 0.9994910 0.9995184
## Upper_CI  0.9996037 1.000044 1.000569 1.0003462 0.9995321 1.0007017
## 
## [[2]]
##            comb_12  comb_13  comb_14  comb_23  comb_24  comb_34
## Point_Est 1.001804 1.000069 1.002812 1.000302 1.001237 1.000168
## Upper_CI  1.008455 1.003545 1.016536 1.000873 1.002326 1.003551

The trace and density plots look as expected and the Gelman statistics are all below 1.2, so we take all four chains forward for our analysis.
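For intuition about what the Gelman statistic measures, the sketch below computes a plain potential scale reduction factor for simulated chains. It compares between-chain and within-chain variance and omits the additional corrections applied by packages such as coda, so it is not the code that calculateGelman runs. The helper name `gelman_rhat` and the simulated chains are hypothetical.

```r
# chains: numeric matrix with one column per chain and one row per iteration
gelman_rhat <- function(chains) {
  n <- nrow(chains)
  W <- mean(apply(chains, 2, var))   # mean within-chain variance
  B <- n * var(colMeans(chains))     # between-chain variance
  var_plus <- (n - 1) / n * W + B / n
  sqrt(var_plus / W)
}

set.seed(1)
mixed <- matrix(rnorm(4000), ncol = 4)  # four chains sampling the same target
stuck <- mixed
stuck[, 4] <- stuck[, 4] + 3            # one chain exploring a different region

rhat_mixed <- gelman_rhat(mixed)
rhat_stuck <- gelman_rhat(stuck)
c(mixed = rhat_mixed, stuck = rhat_stuck)
```

Values close to 1 indicate that the chains agree with each other, whereas the shifted chain pushes the statistic well above the 1.2 cut-off used here.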

2.3 Extracting the bandle results

We use bandlePredict to append the results to the original MSnSet objects. Note that the results are appended to the first replicate of each condition in the MSnSet data structure.

res <- bandlePredict(objectCond1 =  basal, 
                     objectCond2 =  insulin, 
                     params = bandle_output, 
                     fcol = "markers")

res_basal <- res[[1]]
res_insulin <- res[[2]]
# Update labels of each replicate
for (i in seq(res_basal)) {
  res_basal[[i]] <- updateFvarLabels(res_basal[[i]],
                                     label = paste0("rep", i),
                                     sep = "_")
  res_insulin[[i]] <- updateFvarLabels(res_insulin[[i]],
                                       label = paste0("rep", i),
                                       sep = "_")
}

# Combine into MSnSet
res_basal_p <- MSnbase::combine(res_basal[[1]], res_basal[[2]])
res_basal_p <- MSnbase::combine(res_basal_p, res_basal[[3]])
res_insulin_p <- MSnbase::combine(res_insulin[[1]], res_insulin[[2]])
res_insulin_p <- MSnbase::combine(res_insulin_p, res_insulin[[3]])

3 Exploring the allocations and outliers

BANDLE gives us the allocation probability (the mean posterior probability for the master protein subcellular allocations computed by TAGM-MCMC), and the outlier probability (the posterior probability for the protein to belong to the outlier component rather than any of the annotated components in the basal LOPIT data).

Below we threshold on both of these values, to remove allocations from proteins that have a low allocation probability or high probability of being an outlier to make our allocations more robust and reduce over-allocation to subcellular compartments. Proteins where the allocation or outlier probability value does not meet the threshold are left unannotated and given “unknown” localisation.
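The effect of thresholding on both values can be sketched on a small made-up table. The protein names, probabilities and cut-offs below are hypothetical and only illustrate the logic; the notebook itself applies the thresholds with getPredictions.

```r
# toy allocations (all values are invented for illustration)
toy <- data.frame(
  protein     = c("P1", "P2", "P3", "P4"),
  allocation  = c("Cytosol", "Nucleus", "Nucleus", "Lysosome"),
  probability = c(0.99, 0.95, 0.60, 0.92),  # allocation probability
  outlier     = c(0.01, 0.80, 0.05, 0.10),  # outlier probability
  stringsAsFactors = FALSE
)

min_probability <- 0.9  # assumed allocation probability cut-off
max_outlier     <- 0.5  # assumed outlier probability cut-off

# keep an allocation only if it is confident AND unlikely to be an outlier
keep <- toy$probability >= min_probability & toy$outlier <= max_outlier
toy$final <- ifelse(keep, toy$allocation, "unknown")
toy[, c("protein", "allocation", "final")]
```

P2 is dropped because of its high outlier probability and P3 because of its low allocation probability; both end up as "unknown".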

3.1 Basal results

## get proteins NOT used as markers
basal_un_msnset <- unknownMSnSet(res_basal_p, fcol = "markers_rep1")

## plot BANDLE allocation probability
nt_prob_b <- tapply(fData(basal_un_msnset)[, "bandle.probability_rep1"],
                    fData(basal_un_msnset)[, "bandle.allocation_rep1"], 
                    summary)
boxplot(nt_prob_b, las = 2, main = "BANDLE probabilities by organelle")

The closer the outlier probability is to 1, the more likely it is we have outliers for that organelle class.

nt_out_b <- tapply(fData(basal_un_msnset)[, "bandle.outlier_rep1"], 
                 fData(basal_un_msnset)[, "bandle.allocation_rep1"], 
                 summary)
boxplot(nt_out_b, las = 2, main = "BANDLE outlier component prob. by organelle")

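Here we create a new column holding the inverted outlier probability (1 - outlier probability), so that the closer the value is to 0, the more likely it is we have outliers for that organelle class. This allows us to use the getPredictions function, which sets scores below a threshold to "unknown", to threshold on this value.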

## create a new column for the outlier probability
fData(res_basal_p)$outlier.prob <- 1 - fData(res_basal_p)$bandle.outlier_rep1
fData(basal_un_msnset)$outlier.prob <- 1 - fData(basal_un_msnset)$bandle.outlier_rep1

nt_out_b <- tapply(fData(basal_un_msnset)[, "outlier.prob"], 
                 fData(basal_un_msnset)[, "bandle.allocation_rep1"], 
                 summary)
# boxplot(nt_out_b, las = 2, main = "BANDLE (1 - outlier) component prob. by organelle: Basal")
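The reason for inverting can be checked on a few made-up numbers: flagging proteins whose outlier probability is above a cut-off is the same as flagging proteins whose inverted score is below one minus that cut-off. The values and the name `cutoff` are hypothetical.

```r
outlier  <- c(0.02, 0.40, 0.75, 0.99)  # invented outlier probabilities
inverted <- 1 - outlier                # same transformation as outlier.prob
cutoff   <- 0.5                        # assumed cut-off

flag_direct   <- outlier > cutoff         # "too likely to be an outlier"
flag_inverted <- inverted < (1 - cutoff)  # "score below threshold" rule

identical(flag_direct, flag_inverted)
```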

3.1.1 Thresholding on allocation probability only (“low confidence”)

First we threshold on allocation probability.

fcol: name of the prediction column in the feature data
scol: name of the prediction score column
mcol: name of the feature metadata column containing the labelled training data (these do not have bandle allocation/probability scores)
t: the score threshold; everything with a score < t is set to ‘unknown’

# here we set the allocation to "unknown" for all proteins where the allocation probability is < 0.9
res_basal_p <- getPredictions(res_basal_p, 
                              fcol = "bandle.allocation_rep1", 
                              scol = "bandle.probability_rep1",
                              mcol = "markers_rep1",
                              t = 0.9)
## ans
##               Cytosol Endoplasmic reticulum       Golgi apparatus 
##                   611                   482                   251 
##              Lysosome         Mitochondrion               Nucleus 
##                   106                   633                   656 
##            Peroxisome       Plasma membrane            Proteasome 
##                    44                   441                   137 
##              Ribosome               unknown 
##                   556                    88

3.1.2 Thresholding on allocation and outlier probability (“high confidence”)

As the outlier probability varies a lot across organelles, we threshold each one individually, based on its median outlier probability score.

summary_df_b <- data.frame(
  Organelle = names(nt_out_b),
  Min = sapply(nt_out_b, function(x) x["Min."]),
  Q1 = sapply(nt_out_b, function(x) x["1st Qu."]),
  Median = sapply(nt_out_b, function(x) x["Median"]),
  Mean = sapply(nt_out_b, function(x) x["Mean"]),
  Q3 = sapply(nt_out_b, function(x) x["3rd Qu."]),
  Max = sapply(nt_out_b, function(x) x["Max."]),
  row.names = NULL
)
print(summary_df_b)
##                Organelle Min      Q1 Median      Mean Q3 Max
## 1                Cytosol   0 0.00000 0.7479 0.5094689  1   1
## 2  Endoplasmic reticulum   0 1.00000 1.0000 0.8559327  1   1
## 3        Golgi apparatus   0 0.00000 0.1470 0.4906198  1   1
## 4               Lysosome   0 0.01035 1.0000 0.7148651  1   1
## 5          Mitochondrion   0 1.00000 1.0000 0.9290889  1   1
## 6                Nucleus   0 0.00000 0.0000 0.1714740  0   1
## 7             Peroxisome   0 1.00000 1.0000 0.8568733  1   1
## 8        Plasma membrane   0 0.00000 0.0102 0.4506814  1   1
## 9             Proteasome   0 0.00000 1.0000 0.5890717  1   1
## 10              Ribosome   0 0.00000 0.0000 0.4147032  1   1
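# if median outlier prob = 0 (highest chance of having outliers) then threshold = 0.5, else threshold = 0
## proteins allocated to Nucleus or Ribosome with an outlier.prob score of < 0.5 are changed to "unknown"
t_user <- c(0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0.5, 0)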
names(t_user) <- c(getMarkerClasses(res_basal_p, fcol = "markers_rep1"), "unknown")

res_basal_p <- getPredictions(res_basal_p, 
                              fcol = "bandle.allocation_rep1.pred", 
                              scol = "outlier.prob",
                              mcol = "markers_rep1",
                              t = t_user)
## ans
##               Cytosol Endoplasmic reticulum       Golgi apparatus 
##                   611                   482                   251 
##              Lysosome         Mitochondrion               Nucleus 
##                   106                   633                   145 
##            Peroxisome       Plasma membrane            Proteasome 
##                    44                   441                   137 
##              Ribosome               unknown 
##                   270                   885
# if median outlier prob = 0 then threshold = 1, else threshold = median
## proteins allocated to an organelle with an outlier probability of <threshold are changed to "unknown"
t_user <- c(0.7479, 1, 0.1470, 1, 1, 1, 1, 0.0102, 1, 1, 0)
names(t_user) <- c(getMarkerClasses(res_basal_p, fcol = "markers_rep1"), "unknown")

res_basal_p <- getPredictions(res_basal_p, 
                     fcol = "bandle.allocation_rep1.pred.pred", 
                     scol = "outlier.prob",
                     mcol = "markers_rep1",
                     t = t_user)
## ans
##               Cytosol Endoplasmic reticulum       Golgi apparatus 
##                   331                   408                   138 
##              Lysosome         Mitochondrion               Nucleus 
##                    80                   600                   129 
##            Peroxisome       Plasma membrane            Proteasome 
##                    39                   238                    87 
##              Ribosome               unknown 
##                   236                  1719
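The two sets of per-organelle thresholds follow simple rules on the median, so they can also be derived rather than typed by hand. The sketch below uses the basal medians printed above and assumes the classes are in the same (alphabetical) order that getMarkerClasses returns; the names `medians`, `t_low` and `t_high` are hypothetical.

```r
# median of outlier.prob per organelle, copied from the basal summary table
medians <- c("Cytosol"               = 0.7479,
             "Endoplasmic reticulum" = 1,
             "Golgi apparatus"       = 0.1470,
             "Lysosome"              = 1,
             "Mitochondrion"         = 1,
             "Nucleus"               = 0,
             "Peroxisome"            = 1,
             "Plasma membrane"       = 0.0102,
             "Proteasome"            = 1,
             "Ribosome"              = 0)

# first rule: median = 0 gives threshold 0.5, otherwise 0
t_low  <- c(ifelse(medians == 0, 0.5, 0), unknown = 0)

# second rule: median = 0 gives threshold 1, otherwise the median itself
t_high <- c(ifelse(medians == 0, 1, medians), unknown = 0)

rbind(t_low, t_high)
```

In the notebook the medians could be taken from `summary_df_b$Median` and named with `summary_df_b$Organelle`, provided the order is checked against the marker classes before the vector is passed to getPredictions.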

3.1.3 Basal: Visualisation of the allocation results

pca_res_basal_p <- plot2D(res_basal_p, fcol = "markers_rep1", plot = FALSE)

par(mfrow = c(3,2))
par(mar = c(6, 6, 6, 2))
prettyMap_overlay(pca_res_basal_p, res_basal_p, 
                  main = "Basal: markers", 
                  fcol = "markers_rep1")
prettyMap_overlay(pca_res_basal_p, res_basal_p, 
                  main = "Basal: bandle protein allocation", 
                  fcol = "bandle.allocation_rep1")
prettyMap_overlay(pca_res_basal_p, res_basal_p, 
                  main = "Basal: 'low' outlier threshold", 
                  fcol = "bandle.allocation_rep1.pred")
prettyMap_overlay(pca_res_basal_p, res_basal_p, 
                  main = "Basal: 'medium' outlier threshold", 
                  fcol = "bandle.allocation_rep1.pred.pred")
prettyMap_overlay(pca_res_basal_p, res_basal_p, 
                  main = "Basal: 'high' outlier threshold", 
                  fcol = "bandle.allocation_rep1.pred.pred.pred")

plot(NULL, xaxt='n',yaxt='n',bty='n',ylab='',xlab='', xlim=0:1, ylim=0:1)
addLegend(res_basal_p, fcol = "markers_rep1", where = "center", cex = 1, ncol = 2, y.intersp = 1.2)

3.2 Insulin results

## get proteins NOT used as markers
insulin_un_msnset <- unknownMSnSet(res_insulin_p, fcol = "markers_rep1")

## plot BANDLE allocation probability
nt_prob_ins <- tapply(fData(insulin_un_msnset)[, "bandle.probability_rep1"], 
             fData(insulin_un_msnset)[, "bandle.allocation_rep1"], 
             summary)
boxplot(nt_prob_ins, las = 2, main = "BANDLE probabilities by organelle")

The closer the outlier probability is to 1, the more likely it is we have outliers for that organelle class.

nt_out_ins <- tapply(fData(insulin_un_msnset)[, "bandle.outlier_rep1"], 
                 fData(insulin_un_msnset)[, "bandle.allocation_rep1"], 
                 summary)
boxplot(nt_out_ins, las = 2, main = "BANDLE outlier component prob. by organelle")

Again, we create a new column holding the inverted outlier probability (1 - outlier probability), so that the closer the value is to 0, the more likely it is we have outliers for that organelle class. This allows us to use the getPredictions function to threshold on this value.

## create a new column for the outlier probability
fData(res_insulin_p)$outlier.prob <- 1 - fData(res_insulin_p)$bandle.outlier_rep1
fData(insulin_un_msnset)$outlier.prob <- 1 - fData(insulin_un_msnset)$bandle.outlier_rep1

## re-plot outlier probability (now the closer to 0 the more likely we have outliers for this class)
nt_out_ins <- tapply(fData(insulin_un_msnset)[, "outlier.prob"], 
                 fData(insulin_un_msnset)[, "bandle.allocation_rep1"], 
                 summary)

3.2.1 Thresholding on allocation probability only (“low confidence”)

First we threshold on allocation probability.

fcol: name of the prediction column in the feature data
scol: name of the prediction score column
mcol: name of the feature metadata column containing the labelled training data (these do not have bandle allocation/probability scores)
t: the score threshold; everything with a score < t is set to ‘unknown’

# here we set the allocation to "unknown" for all proteins where the allocation probability is < 0.9
res_insulin_p <- getPredictions(res_insulin_p, 
                     fcol = "bandle.allocation_rep1", 
                     scol = "bandle.probability_rep1",
                     mcol = "markers_rep1",
                     t = 0.9)
## ans
##               Cytosol Endoplasmic reticulum       Golgi apparatus 
##                   551                   548                   159 
##              Lysosome         Mitochondrion               Nucleus 
##                   101                   621                  1086 
##            Peroxisome       Plasma membrane            Proteasome 
##                    43                   365                   120 
##              Ribosome               unknown 
##                   341                    70

3.2.2 Thresholding on allocation and outlier probability (“high confidence”)

As the outlier probability varies a lot across organelles, we threshold each one individually, based on its median outlier probability score.

summary_df_ins <- data.frame(
  Organelle = names(nt_out_ins),
  Min = sapply(nt_out_ins, function(x) x["Min."]),
  Q1 = sapply(nt_out_ins, function(x) x["1st Qu."]),
  Median = sapply(nt_out_ins, function(x) x["Median"]),
  Mean = sapply(nt_out_ins, function(x) x["Mean"]),
  Q3 = sapply(nt_out_ins, function(x) x["3rd Qu."]),
  Max = sapply(nt_out_ins, function(x) x["Max."]),
  row.names = NULL
)
print(summary_df_ins)
# if median outlier prob = 0 (highest chance of having outliers) then threshold = 0.5, else threshold = 0
## so in organelles with a high chance of having outliers, we remove allocations from proteins in these organelles which have a >50% chance of being an outlier.
### proteins allocated to Nucleus with an outlier probability of <0.5 are changed to "unknown"
t_user_ins <- c(0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0, 0)
names(t_user_ins) <- c(getMarkerClasses(res_insulin_p, fcol = "markers_rep1"), "unknown")
res_insulin_p <- getPredictions(res_insulin_p, 
                     fcol = "bandle.allocation_rep1.pred", 
                     scol = "outlier.prob",
                     mcol = "markers_rep1",
                     t = t_user_ins)
## ans
##               Cytosol Endoplasmic reticulum       Golgi apparatus 
##                   551                   548                   159 
##              Lysosome         Mitochondrion               Nucleus 
##                   101                   621                   171 
##            Peroxisome       Plasma membrane            Proteasome 
##                    43                   365                   120 
##              Ribosome               unknown 
##                   341                   985
# if median outlier prob = 0 then threshold = 1, else threshold = median
## proteins allocated to an organelle with an outlier probability of <threshold are changed to "unknown"
t_user_ins <- c(1, 1, 1, 1, 1, 1, 1, 0.9014, 1, 0.0372, 0)
names(t_user_ins) <- c(getMarkerClasses(res_insulin_p, fcol = "markers_rep1"), "unknown")
res_insulin_p <- getPredictions(res_insulin_p, 
                     fcol = "bandle.allocation_rep1.pred.pred", 
                     scol = "outlier.prob",
                     mcol = "markers_rep1",
                     t = t_user_ins)
## ans
##               Cytosol Endoplasmic reticulum       Golgi apparatus 
##                   302                   415                    95 
##              Lysosome         Mitochondrion               Nucleus 
##                    77                   586                   147 
##            Peroxisome       Plasma membrane            Proteasome 
##                    37                   205                    79 
##              Ribosome               unknown 
##                   202                  1860

3.2.3 Insulin: Visualisation of the allocation results

pca_res_insulin_p <- plot2D(res_insulin_p, fcol = "markers_rep1", plot = FALSE)

par(mfrow = c(3, 2))
par(mar = c(6, 6, 6, 2))
prettyMap_overlay(pca_res_insulin_p, res_insulin_p, 
                  main = "Insulin: markers", 
                  fcol = "markers_rep1")
prettyMap_overlay(pca_res_insulin_p, res_insulin_p, 
                  main = "Insulin: bandle protein allocation", 
                  fcol = "bandle.allocation_rep1")
prettyMap_overlay(pca_res_insulin_p, res_insulin_p, 
                  main = "Insulin: 'low' outlier threshold", 
                  fcol = "bandle.allocation_rep1.pred")
prettyMap_overlay(pca_res_insulin_p, res_insulin_p, 
                  main = "Insulin: 'medium' outlier threshold", 
                  fcol = "bandle.allocation_rep1.pred.pred")
prettyMap_overlay(pca_res_insulin_p, res_insulin_p, 
                  main = "Insulin: 'high' outlier threshold", 
                  fcol = "bandle.allocation_rep1.pred.pred.pred")

4 Differential localisation

The differential localisation probability tells us which proteins are most likely to differentially localise, and therefore exhibit a change in their steady-state subcellular location. The differential localisation probability is found in the bandle.differential.localisation column of the bandleParams output.

# extract all the diff loc probabilities
pe1 <- summaries(bandle_output)[[1]]@posteriorEstimates
pe2 <- summaries(bandle_output)[[2]]@posteriorEstimates
diffloc_probs <- pe1$bandle.differential.localisation
par(mar = c(5, 5, 6, 2))
plot(diffloc_probs[order(diffloc_probs, decreasing = TRUE)],
     col = getStockcol()[2], pch = 1, 
     ylab = "Differential localisation probability",
     xlab = "Protein rank", 
     main = "Differential localisation rank plot", 
     xlim = c(0,3500), ylim = c(0,1))

We can examine the number of differentially localising proteins we have according to probability.

# Define probability threshold
thresholds <- c(1, .95, seq(.9, .1, by = -.1))

# Count proteins meeting each threshold
counts <- sapply(thresholds, function(threshold) sum(diffloc_probs >= threshold))

# Print the result
result_df <- data.frame(Threshold = thresholds, Count = counts)
print(result_df)
##    Threshold Count
## 1       1.00   502
## 2       0.95   596
## 3       0.90   621
## 4       0.80   650
## 5       0.70   665
## 6       0.60   674
## 7       0.50   691
## 8       0.40   707
## 9       0.30   720
## 10      0.20   739
## 11      0.10   761

4.1 Candidates

We define all proteins with a differential localisation probability equal to 1 as candidates.

ind <- which(fData(res_basal_p)$bandle.differential.localisation_rep1 == 1)
candidates <- featureNames(res_basal_p)[ind]
print(candidates)
##   [1] "A2AHC3" "A2ALW5" "A2RSY6" "A3KGB4" "B2RX14" "B2RY04" "D3YZV8" "E9PYK3"
##   [9] "O08532" "O08582" "O08648" "O08759" "O08919" "O09012" "O09164" "O35245"
##  [17] "O35326" "O35344" "O54781" "O54950" "O54984" "O55029" "O55102" "O55135"
##  [25] "O55222" "O70493" "O70566" "O88447" "O88520" "O88544" "O88597" "O88796"
##  [33] "O88878" "O88983" "O89079" "O89116" "P04627" "P08122" "P08226" "P08752"
##  [41] "P09405" "P0C0A3" "P11103" "P12023" "P14142" "P14873" "P15105" "P15535"
##  [49] "P17809" "P18052" "P19096" "P19182" "P19426" "P21107" "P21460" "P23116"
##  [57] "P23506" "P23949" "P23950" "P27612" "P28650" "P29351" "P29391" "P30999"
##  [65] "P32507" "P34152" "P35569" "P39053" "P39054" "P40237" "P40336" "P40694"
##  [73] "P42932" "P51954" "P52633" "P56212" "P56380" "P56501" "P56959" "P56960"
##  [81] "P58252" "P58774" "P58871" "P59048" "P60229" "P60487" "P60521" "P60766"
##  [89] "P60904" "P61202" "P61924" "P62627" "P62748" "P62869" "P63024" "P63028"
##  [97] "P63037" "P63330" "P67871" "P68373" "P70122" "P70280" "P70362" "P70662"
## [105] "P70663" "P80316" "P80317" "P81122" "P97313" "P97390" "P97434" "P97499"
## [113] "P97742" "P97819" "P97820" "P98195" "Q00519" "Q02248" "Q02819" "Q03958"
## [121] "Q05512" "Q05BC3" "Q05D44" "Q06138" "Q07113" "Q08093" "Q2KN98" "Q2TPA8"
## [129] "Q3B7Z2" "Q3KNM2" "Q3SXD3" "Q3TDD9" "Q3TFD2" "Q3TLD5" "Q3TPE9" "Q3TWW8"
## [137] "Q3U0B3" "Q3U0V1" "Q3U1C6" "Q3U2P1" "Q3U6N9" "Q3UGC7" "Q3UGP9" "Q3UH60"
## [145] "Q3UHC7" "Q3UHX2" "Q3UIW5" "Q3UJD6" "Q3UMY5" "Q3UQN2" "Q3UU96" "Q3UVL4"
## [153] "Q3V009" "Q45VK7" "Q4VAA2" "Q4VBF2" "Q4VGL6" "Q5BLK4" "Q5EG47" "Q5H8C4"
## [161] "Q5NCX5" "Q5SNZ0" "Q5SRX1" "Q5XJY5" "Q60575" "Q60631" "Q60668" "Q60864"
## [169] "Q61037" "Q61187" "Q61194" "Q61559" "Q61696" "Q61768" "Q61823" "Q62093"
## [177] "Q62130" "Q62188" "Q62465" "Q64373" "Q64514" "Q64735" "Q69ZP3" "Q69ZS7"
## [185] "Q6A009" "Q6A065" "Q6DVA0" "Q6GQR8" "Q6NZF1" "Q6P073" "Q6P542" "Q6P5E6"
## [193] "Q6P8I4" "Q6P8X1" "Q6P9R4" "Q6P9S0" "Q6PCP5" "Q6PDH0" "Q6PDM2" "Q6PF93"
## [201] "Q6Y685" "Q6ZWQ0" "Q6ZWY3" "Q71FD5" "Q78PG9" "Q7TMF2" "Q7TPD0" "Q7TPH6"
## [209] "Q7TPM9" "Q80SY4" "Q80SY5" "Q80TH2" "Q80TL7" "Q80TM6" "Q80TM9" "Q80TY0"
## [217] "Q80U58" "Q80WC7" "Q80X41" "Q80X90" "Q80YE7" "Q80YR4" "Q810B6" "Q810J8"
## [225] "Q810U5" "Q8BG30" "Q8BG60" "Q8BGF3" "Q8BGI4" "Q8BGK9" "Q8BGR2" "Q8BGT6"
## [233] "Q8BHL3" "Q8BHL5" "Q8BI72" "Q8BIG7" "Q8BIJ7" "Q8BIK4" "Q8BJH1" "Q8BJS4"
## [241] "Q8BJY1" "Q8BK03" "Q8BK30" "Q8BK63" "Q8BK67" "Q8BKX1" "Q8BL48" "Q8BM55"
## [249] "Q8BMJ3" "Q8BPB0" "Q8BPM2" "Q8BPT6" "Q8BR63" "Q8BTM8" "Q8BUE4" "Q8BUK6"
## [257] "Q8BVI5" "Q8BVU0" "Q8BVU5" "Q8BW10" "Q8BW41" "Q8BWR4" "Q8BX90" "Q8BX94"
## [265] "Q8C0D4" "Q8C0J2" "Q8C0T5" "Q8C129" "Q8C1F4" "Q8C3X8" "Q8C4Y3" "Q8C6E0"
## [273] "Q8C6M1" "Q8C754" "Q8C8T8" "Q8C9B9" "Q8CB77" "Q8CBE3" "Q8CCB4" "Q8CCP0"
## [281] "Q8CD15" "Q8CFH6" "Q8CG48" "Q8CG76" "Q8CG79" "Q8CGC6" "Q8CI33" "Q8CI71"
## [289] "Q8CI95" "Q8CIE6" "Q8CIF4" "Q8CJ53" "Q8JZQ9" "Q8K013" "Q8K021" "Q8K124"
## [297] "Q8K1A6" "Q8K1C0" "Q8K245" "Q8K268" "Q8K2A1" "Q8K2C7" "Q8K3B1" "Q8K3I9"
## [305] "Q8K409" "Q8K4L3" "Q8QZY1" "Q8R0S2" "Q8R0W0" "Q8R0Y6" "Q8R1A4" "Q8R1B4"
## [313] "Q8R323" "Q8R409" "Q8R4H2" "Q8R550" "Q8VBZ0" "Q8VC48" "Q8VC85" "Q8VCR3"
## [321] "Q8VD65" "Q8VDG7" "Q8VDJ3" "Q8VDM6" "Q8VDP4" "Q8VDS8" "Q8VE97" "Q8VEG6"
## [329] "Q8VEJ9" "Q8VH51" "Q8VHX6" "Q8VIJ8" "Q91V64" "Q91V92" "Q91VX2" "Q91VZ6"
## [337] "Q91W89" "Q91WF7" "Q91WG5" "Q91WJ8" "Q91WK2" "Q91WQ3" "Q91WR3" "Q91XC8"
## [345] "Q91XI1" "Q91YN9" "Q91Z38" "Q91Z53" "Q91ZP3" "Q91ZR1" "Q91ZU6" "Q91ZW2"
## [353] "Q921F4" "Q921R8" "Q921T2" "Q922H9" "Q923B1" "Q99020" "Q99104" "Q99JP0"
## [361] "Q99JW4" "Q99JX4" "Q99JY4" "Q99JY8" "Q99K28" "Q99KH8" "Q99KR3" "Q99KU0"
## [369] "Q99KY4" "Q99L04" "Q99LD4" "Q99LG0" "Q99LH1" "Q99LJ6" "Q99LJ7" "Q99ME2"
## [377] "Q99MZ7" "Q99NH2" "Q99P88" "Q9CQ71" "Q9CQA1" "Q9CQC9" "Q9CQG2" "Q9CQJ2"
## [385] "Q9CQJ6" "Q9CQP2" "Q9CR26" "Q9CR60" "Q9CR95" "Q9CRA5" "Q9CW46" "Q9CXG3"
## [393] "Q9CXL3" "Q9CY18" "Q9CYI4" "Q9CZD5" "Q9CZT5" "Q9D032" "Q9D074" "Q9D0B6"
## [401] "Q9D0C4" "Q9D0D5" "Q9D0I4" "Q9D0I8" "Q9D0L8" "Q9D0T2" "Q9D0Z3" "Q9D154"
## [409] "Q9D2E2" "Q9D394" "Q9D3U0" "Q9D4F2" "Q9D5T0" "Q9D662" "Q9D6V8" "Q9D6W8"
## [417] "Q9D6Y7" "Q9D706" "Q9D7A8" "Q9D7N6" "Q9D7S9" "Q9D8C4" "Q9D8N2" "Q9D8X2"
## [425] "Q9D8Z1" "Q9D902" "Q9DB27" "Q9DBR1" "Q9DBR3" "Q9DC51" "Q9DCG9" "Q9DCH4"
## [433] "Q9DCR2" "Q9DD20" "Q9EP53" "Q9EP71" "Q9EPC1" "Q9EPJ9" "Q9EPK7" "Q9EQ80"
## [441] "Q9EQG7" "Q9ER00" "Q9ERR7" "Q9ERU9" "Q9ERV1" "Q9ES28" "Q9ET54" "Q9JHW2"
## [449] "Q9JI10" "Q9JI78" "Q9JIF7" "Q9JIS8" "Q9JJ28" "Q9JJ59" "Q9JJV2" "Q9JKK1"
## [457] "Q9JKN1" "Q9JL15" "Q9JLM8" "Q9JLQ0" "Q9JLQ2" "Q9JM13" "Q9JM52" "Q9JMK2"
## [465] "Q9QUI0" "Q9QUI1" "Q9QUJ7" "Q9QWY8" "Q9QXK3" "Q9QXN3" "Q9QXS1" "Q9QY06"
## [473] "Q9QYB5" "Q9QYC0" "Q9QYH6" "Q9QYI6" "Q9QZ06" "Q9QZD9" "Q9QZN4" "Q9QZQ1"
## [481] "Q9R0Y5" "Q9WTP2" "Q9WTZ0" "Q9WU40" "Q9WU60" "Q9WUM4" "Q9WV55" "Q9WV86"
## [489] "Q9WV95" "Q9WVA3" "Q9WVB0" "Q9WVB4" "Q9WVE8" "Q9Z129" "Q9Z1D1" "Q9Z1E4"
## [497] "Q9Z1G4" "Q9Z1T1" "Q9Z266" "Q9Z2R6" "Q9Z2X8" "Q9Z2Y8"

4.2 Visualisation

bandle_res_basal <- MSnSetList(list(res_basal_p[, 1:10],
                                    res_basal_p[, 11:20],
                                    res_basal_p[, 21:30])) 

bandle_res_insulin <- MSnSetList(list(res_insulin_p[, 1:10],
                                      res_insulin_p[, 11:20],
                                      res_insulin_p[, 21:30])) 

orgs <- c(union(getMarkerClasses(bandle_res_basal[[1]], "bandle.allocation_rep1.pred.pred.pred"), 
                getMarkerClasses(bandle_res_insulin[[1]], "bandle.allocation_rep1.pred.pred.pred")))

circos_cols <- c(getStockcol()[seq_along(orgs)], "grey")
(colscheme <- setNames(circos_cols, c(orgs, "unknown"))) 
##               Cytosol Endoplasmic reticulum       Golgi apparatus 
##             "#E41A1C"             "#377EB8"             "#309C17" 
##              Lysosome         Mitochondrion               Nucleus 
##             "#FF7F00"             "#FFD700"             "#00CED1" 
##            Peroxisome       Plasma membrane            Proteasome 
##             "#A65628"             "#F781BF"             "#984EA3" 
##              Ribosome               unknown 
##             "#9ACD32"                "grey"
subset_msnset <- list(bandle_res_basal[[1]][ind, ], 
                      bandle_res_insulin[[1]][ind, ])

Chord diagram

plotTranslocations(subset_msnset, fcol = "bandle.allocation_rep1.pred.pred.pred", col = colscheme, type = "chord")
## 502 features in common
## ------------------------------------------------
## If length(fcol) == 1 it is assumed that the
## same fcol is to be used for both datasets
## setting fcol = c(bandle.allocation_rep1.pred.pred.pred,bandle.allocation_rep1.pred.pred.pred)
## ----------------------------------------------

Alluvial

plotTranslocations(subset_msnset, fcol = "bandle.allocation_rep1.pred.pred.pred", col = colscheme, type = "alluvial")
## 502 features in common
## ------------------------------------------------
## If length(fcol) == 1 it is assumed that the
## same fcol is to be used for both datasets
## setting fcol = c(bandle.allocation_rep1.pred.pred.pred,bandle.allocation_rep1.pred.pred.pred)
## ----------------------------------------------

Table

plotTable(subset_msnset, fcol = "bandle.allocation_rep1.pred.pred.pred", all = TRUE)
## 502 features in common
## ------------------------------------------------
## If length(fcol) == 1 it is assumed that the
## same fcol is to be used for both datasets
## setting fcol = c(bandle.allocation_rep1.pred.pred.pred, bandle.allocation_rep1.pred.pred.pred)
## ----------------------------------------------
##               Condition1            Condition2 value
## 7                Cytosol               unknown     2
## 14 Endoplasmic reticulum       Plasma membrane     1
## 16 Endoplasmic reticulum               unknown     2
## 23       Golgi apparatus       Plasma membrane    11
## 25       Golgi apparatus               unknown     7
## 34         Mitochondrion               unknown    10
## 38       Plasma membrane Endoplasmic reticulum    18
## 39       Plasma membrane       Golgi apparatus     2
## 43       Plasma membrane               unknown    10
## 52              Ribosome               unknown     8
## 56               unknown Endoplasmic reticulum     3
## 57               unknown       Golgi apparatus     1
## 59               unknown       Plasma membrane     5
## 61               unknown               unknown   420
## 62               unknown              Lysosome     1
## 63               unknown            Proteasome     1
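The table above is essentially a cross-tabulation of each protein's allocation in the two conditions. A minimal base R sketch on made-up allocations shows the idea; it is not the implementation of plotTable, and the vectors `basal` and `insulin` are hypothetical.

```r
# invented allocations for five proteins, in the same protein order
basal   <- c("Golgi apparatus", "Golgi apparatus", "Plasma membrane",
             "unknown", "Plasma membrane")
insulin <- c("Plasma membrane", "Plasma membrane", "Endoplasmic reticulum",
             "unknown", "Endoplasmic reticulum")

# count every (Condition1, Condition2) pair and keep the observed ones
moves <- as.data.frame(table(Condition1 = basal, Condition2 = insulin),
                       stringsAsFactors = FALSE)
moves <- moves[moves$Freq > 0, ]
names(moves)[3] <- "value"
moves
```

Each row counts the proteins that move from the Condition1 compartment to the Condition2 compartment; rows where both columns agree correspond to proteins whose thresholded allocation did not change.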